Background

This module extends code contained in Coronavirus_Statistics_v004.Rmd to include sourcing of all key functions and parameters. This file includes the latest code for analyzing all-cause death data from CDC Weekly Deaths by Jurisdiction. CDC maintains data on deaths by week, age cohort, and state in the US. Downloaded data are unique by state, epidemiological week, year, age, and type (actual vs. predicted/projected).

These data are known to have a lag between death and reporting. The CDC back-corrects so that each death is counted in the week it occurred, even when it is reported in a later week. As a result, totals for recent weeks tend to run low, and the CDC publishes a projection of the expected final number of deaths based on historical lag times. Other analysts have reported that lag times are currently much longer than historical averages (significant supra-lag), which would cause the CDC projected deaths for recent weeks to run low as well.
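
The effect of supra-lag on a lag-based projection can be illustrated with a minimal sketch. All numbers below are hypothetical, and scaling reported counts by historical completeness is only an assumed stand-in for whatever weighting the CDC actually applies:

```r
# Hypothetical share of deaths reported so far, by weeks since death
histCompleteness <- c(0.60, 0.80, 0.90, 0.97)   # assumed historical pattern
trueCompleteness <- c(0.45, 0.70, 0.85, 0.95)   # assumed slower reporting (supra-lag)
trueDeaths <- rep(1000, 4)                       # hypothetical true weekly deaths

# Counts actually reported so far under the slower pattern
reported <- trueDeaths * trueCompleteness

# Projection that scales reported counts by the historical completeness
projected <- reported / histCompleteness

# Projections land above the raw counts but still below the true totals
data.frame(weeksAgo=1:4, 
           reported, 
           projected=round(projected), 
           shortfall=round(trueDeaths - projected)
           )
```

When reporting is slower than the historical pattern, the projection corrects only part of the gap, which is the behavior described above for recent weeks.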

The code leverages tidyverse and sourced functions throughout:

# All functions assume that tidyverse and its components are loaded and available
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.3     v purrr   0.3.4
## v tibble  3.1.1     v dplyr   1.0.6
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
# If the same function is in both files, use the version from the more specific source
source("./Generic_Added_Utility_Functions_202105_v001.R")
source("./Coronavirus_CDC_Excess_Functions_v001.R")
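
The sourcing order matters because a later source() call overwrites any function of the same name defined by an earlier one. A minimal sketch with two throwaway files (the file contents and the function name whichVersion are hypothetical) shows the behavior:

```r
# Two temporary files that define the same function name (hypothetical contents)
genericFile <- tempfile(fileext=".R")
specificFile <- tempfile(fileext=".R")
writeLines('whichVersion <- function() "generic"', genericFile)
writeLines('whichVersion <- function() "specific"', specificFile)

# Source into a scratch environment: generic first, specific second
env <- new.env()
source(genericFile, local=env)
source(specificFile, local=env)

# The later definition has replaced the earlier one
env$whichVersion()
```

Sourcing the generic utilities first and the CDC-specific file second therefore leaves the more specific version in place.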

Running Code

The main function is readRunCDCAllCause(), which performs multiple tasks:

STEP 0: Optionally, downloads the latest data file from CDC
STEP 1: Reads and processes a data file that has been downloaded from CDC to a local directory
STEP 2: Extracts relevant data from a processed state-level COVID Tracking Project list
STEP 3: Basic plots of the CDC data
STEP 4: Basic excess-deaths analysis
STEP 5: Create cluster-level aggregate plots
STEP 6: Create state-level aggregate plots
STEP 7: Create age-cohort aggregate plots
STEP 8: Returns a list of key data frames, modeling objects, named cluster vectors, etc.

The functions are tested on previously downloaded data:

cdcLoc <- "Weekly_counts_of_deaths_by_jurisdiction_and_age_group_downloaded_20210623.csv"
cdcList_20210703 <- readRunCDCAllCause(loc=cdcLoc, 
                                       weekThru=17, 
                                       lst=readFromRDS("cdc_daily_210528"), 
                                       dlData=FALSE, 
                                       stateNoCheck=c("NC"), 
                                       pdfCluster=TRUE, 
                                       pdfAge=TRUE
                                       )
## 
## Parameter cvDeathThru has been set as: 2021-05-01 
## 
## 
##  *** Data suppression checks *** 
## # A tibble: 2 x 6
##   noCheck state problem curWeek     n deaths
##   <lgl>   <chr> <lgl>   <lgl>   <int>  <dbl>
## 1 TRUE    NC    TRUE    FALSE      72     NA
## 2 TRUE    NC    TRUE    TRUE        6     NA
## # A tibble: 2 x 3
##   noCheck curWeek     n
##   <lgl>   <lgl>   <int>
## 1 TRUE    FALSE      72
## 2 TRUE    TRUE        6
## 
## 
## Data suppression checks passed
## 
## 
## *** File has been checked for uniqueness by: state year week age 
## 
## Rows: 91,537
## Columns: 12
## $ fullState  <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Ala~
## $ weekEnding <date> 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10~
## $ state      <chr> "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL",~
## $ year       <fct> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,~
## $ week       <int> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4,~
## $ age        <fct> Under 25 years, 25-44 years, 45-64 years, 65-74 years, 75-8~
## $ period     <fct> 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015~
## $ Type       <chr> "Predicted (weighted)", "Predicted (weighted)", "Predicted ~
## $ Suppress   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ n          <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,~
## $ deaths     <dbl> 25, 67, 253, 202, 272, 320, 28, 49, 256, 222, 253, 332, 26,~
## $ Note       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## 
## Check Control Levels and Record Counts for Processed Data:
## 
## 
## Checking variable combination: age 
## # A tibble: 6 x 4
##   age                    n n_deaths_na  deaths
##   <fct>              <dbl>       <dbl>   <dbl>
## 1 Under 25 years     10735           0  369164
## 2 25-44 years        13656           0  902390
## 3 45-64 years        16793           0 3549786
## 4 65-74 years        16783           0 3558139
## 5 75-84 years        16790           0 4401133
## 6 85 years and older 16780           0 5681860
## 
## 
## Checking variable combination: period year Type 
## # A tibble: 7 x 6
##   period    year  Type                     n n_deaths_na  deaths
##   <fct>     <fct> <chr>                <dbl>       <dbl>   <dbl>
## 1 2015-2019 2015  Predicted (weighted) 14364           0 2691180
## 2 2015-2019 2016  Predicted (weighted) 14445           0 2723236
## 3 2015-2019 2017  Predicted (weighted) 14404           0 2801986
## 4 2015-2019 2018  Predicted (weighted) 14400           0 2830372
## 5 2015-2019 2019  Predicted (weighted) 14415           0 2844025
## 6 2020      2020  Predicted (weighted) 14837           0 3433405
## 7 2021      2021  Predicted (weighted)  4672           0 1138268
## 
## 
## Checking variable combination: period Suppress 
## # A tibble: 3 x 5
##   period    Suppress     n n_deaths_na   deaths
##   <fct>     <chr>    <dbl>       <dbl>    <dbl>
## 1 2015-2019 <NA>     72028           0 13890799
## 2 2020      <NA>     14837           0  3433405
## 3 2021      <NA>      4672           0  1138268
## 
## 
## Checking variable combination: period Note 
## # A tibble: 9 x 5
##   period   Note                                            n n_deaths_na  deaths
##   <fct>    <chr>                                       <dbl>       <dbl>   <dbl>
## 1 2015-20~ <NA>                                        72028           0  1.39e7
## 2 2020     Data in recent weeks are incomplete. Only ~ 13194           0  2.96e6
## 3 2020     Data in recent weeks are incomplete. Only ~   531           0  2.31e5
## 4 2020     Weighted numbers of deaths are 20% or more~   280           0  6.00e4
## 5 2020     Weights may be too low to account for unde~    18           0  9.85e3
## 6 2020     <NA>                                          814           0  1.69e5
## 7 2021     Data in recent weeks are incomplete. Only ~  4469           0  1.10e6
## 8 2021     Data in recent weeks are incomplete. Only ~    14           0  9.65e2
## 9 2021     Data in recent weeks are incomplete. Only ~   189           0  3.58e4

## 
## *** File has been checked for uniqueness by: cluster year week

## 
## Plots will be run after excluding stateNoCheck states

## 
## Detailed cluster summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_cluster_2021w17.pdf

## 
## Returning plot outputs to the main log file

## Joining, by = "state"

## 
## Detailed age summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_age_2021w17.pdf

## 
## Returning plot outputs to the main log file
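
The log reports cvDeathThru as 2021-05-01 for weekThru=17. The code that sets this parameter is not shown here, so the sketch below is an assumption: it applies the MMWR epidemiological week convention (weeks run Sunday through Saturday, and week 1 is the week containing January 4). The helper name mmwrWeekEnding is hypothetical.

```r
# Week-ending (Saturday) date for an MMWR epidemiological week
mmwrWeekEnding <- function(year, week) {
    # Week 1 is the Sunday-Saturday week that contains January 4
    jan4 <- as.Date(sprintf("%d-01-04", as.integer(year)))
    daysSinceSunday <- as.integer(format(jan4, "%w"))   # 0 means Sunday
    week1Start <- jan4 - daysSinceSunday
    # Saturday of week 1, plus seven days for each later week
    week1Start + 6 + 7 * (as.integer(week) - 1L)
}

mmwrWeekEnding(2021, 17)   # 2021-05-01, the cvDeathThru value in the log
mmwrWeekEnding(2015, 1)    # 2015-01-10, the first weekEnding in the glimpse output
```

Under this convention the week number and the week-ending date are interchangeable, which is why weekThru alone is enough to fix the cutoff date.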

The latest data are downloaded and processed:

cdcLoc <- "Weekly_counts_of_deaths_by_jurisdiction_and_age_group_downloaded_20210708.csv"
cdcList_20210708 <- readRunCDCAllCause(loc=cdcLoc, 
                                       weekThru=22, 
                                       lst=readFromRDS("cdc_daily_210708"), 
                                       stateNoCheck=c("NC", "AK", "WV"), 
                                       pdfCluster=TRUE, 
                                       pdfAge=TRUE
                                       )
## 
## Parameter cvDeathThru has been set as: 2021-06-05 
## 
## 
##  *** Data suppression checks *** 
## # A tibble: 4 x 6
##   noCheck state problem curWeek     n deaths
##   <lgl>   <chr> <lgl>   <lgl>   <int>  <dbl>
## 1 TRUE    AK    TRUE    FALSE       2     NA
## 2 TRUE    NC    TRUE    FALSE     102     NA
## 3 TRUE    NC    TRUE    TRUE        6     NA
## 4 TRUE    WV    TRUE    TRUE        2     NA
## # A tibble: 2 x 3
##   noCheck curWeek     n
##   <lgl>   <lgl>   <int>
## 1 TRUE    FALSE     104
## 2 TRUE    TRUE        8
## 
## 
## Data suppression checks passed
## 
## 
## *** File has been checked for uniqueness by: state year week age 
## 
## Rows: 92,880
## Columns: 12
## $ fullState  <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Ala~
## $ weekEnding <date> 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10~
## $ state      <chr> "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL",~
## $ year       <fct> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,~
## $ week       <int> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4,~
## $ age        <fct> Under 25 years, 25-44 years, 45-64 years, 65-74 years, 75-8~
## $ period     <fct> 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015~
## $ Type       <chr> "Predicted (weighted)", "Predicted (weighted)", "Predicted ~
## $ Suppress   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ n          <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,~
## $ deaths     <dbl> 25, 67, 253, 202, 272, 320, 28, 49, 256, 222, 253, 332, 26,~
## $ Note       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## 
## Check Control Levels and Record Counts for Processed Data:
## 
## 
## Checking variable combination: age 
## # A tibble: 6 x 4
##   age                    n n_deaths_na  deaths
##   <fct>              <dbl>       <dbl>   <dbl>
## 1 Under 25 years     10890           0  374959
## 2 25-44 years        13868           0  919211
## 3 45-64 years        17038           0 3605423
## 4 65-74 years        17027           0 3615820
## 5 75-84 years        17033           0 4467166
## 6 85 years and older 17024           0 5757892
## 
## 
## Checking variable combination: period year Type 
## # A tibble: 7 x 6
##   period    year  Type                     n n_deaths_na  deaths
##   <fct>     <fct> <chr>                <dbl>       <dbl>   <dbl>
## 1 2015-2019 2015  Predicted (weighted) 14364           0 2691176
## 2 2015-2019 2016  Predicted (weighted) 14443           0 2723213
## 3 2015-2019 2017  Predicted (weighted) 14408           0 2802027
## 4 2015-2019 2018  Predicted (weighted) 14400           0 2830376
## 5 2015-2019 2019  Predicted (weighted) 14414           0 2844003
## 6 2020      2020  Predicted (weighted) 14838           0 3432903
## 7 2021      2021  Predicted (weighted)  6013           0 1416773
## 
## 
## Checking variable combination: period Suppress 
## # A tibble: 3 x 5
##   period    Suppress     n n_deaths_na   deaths
##   <fct>     <chr>    <dbl>       <dbl>    <dbl>
## 1 2015-2019 <NA>     72029           0 13890795
## 2 2020      <NA>     14838           0  3432903
## 3 2021      <NA>      6013           0  1416773
## 
## 
## Checking variable combination: period Note 
## # A tibble: 10 x 5
##    period   Note                                           n n_deaths_na  deaths
##    <fct>    <chr>                                      <dbl>       <dbl>   <dbl>
##  1 2015-20~ <NA>                                       72029           0  1.39e7
##  2 2020     Data in recent weeks are incomplete. Only~ 13459           0  3.04e6
##  3 2020     Data in recent weeks are incomplete. Only~     5           0  1.24e2
##  4 2020     Data in recent weeks are incomplete. Only~   262           0  1.57e5
##  5 2020     Weighted numbers of deaths are 20% or mor~   280           0  6.00e4
##  6 2020     Weights may be too low to account for und~    10           0  5.95e3
##  7 2020     <NA>                                         822           0  1.73e5
##  8 2021     Data in recent weeks are incomplete. Only~  5631           0  1.34e6
##  9 2021     Data in recent weeks are incomplete. Only~    24           0  2.00e3
## 10 2021     Data in recent weeks are incomplete. Only~   358           0  7.15e4

## 
## *** File has been checked for uniqueness by: cluster year week

## 
## Plots will be run after excluding stateNoCheck states

## 
## Detailed cluster summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_cluster_2021w22.pdf

## 
## Returning plot outputs to the main log file

## Joining, by = "state"

## 
## Detailed age summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_age_2021w22.pdf

## 
## Returning plot outputs to the main log file

saveToRDS(cdcList_20210708)
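
saveToRDS() and readFromRDS() come from the sourced utility file and are not shown here. As a rough sketch of the kind of round trip they presumably perform, the example below assumes they wrap base R saveRDS() and readRDS() and derive the file name from the object name. Both points are assumptions, and the helper names saveByName and readByName are hypothetical.

```r
# Save an object to <dir>/<object name>.RDS (hypothetical helper)
saveByName <- function(obj, dir=tempdir()) {
    objName <- deparse(substitute(obj))   # name of the object as passed by the caller
    path <- file.path(dir, paste0(objName, ".RDS"))
    saveRDS(obj, file=path)
    invisible(path)
}

# Read an object back by the name it was saved under (hypothetical helper)
readByName <- function(objName, dir=tempdir()) {
    readRDS(file.path(dir, paste0(objName, ".RDS")))
}

# Round trip on a small demonstration list
demoList <- list(a=1:3, b="x")
savedPath <- saveByName(demoList)
identical(readByName("demoList"), demoList)
```

Keying the file on the object name keeps dated objects such as cdcList_20210708 distinct on disk without passing a separate file name.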

The function readProcessCDC() is updated to allow more control over zeroing out (rather than erroring on) suppressed records when only a small number of them are present:

# Function to check for CDC excess suppression
checkCDCSuppression <- function(df, stateNoCheck, errTotAllowed=20, errMaxAllowed=round(errTotAllowed/2)) {
    
    # Categorize the potential issues in the file (note to suppress or NA deaths)
    checkProblems <- df %>% 
        mutate(problem=(!is.na(Suppress) | is.na(deaths)), 
               noCheck=state %in% stateNoCheck
               )
    
    # Print a list of the problems, excluding those in stateNoCheck
    cat("\nRows in states to be checked that have NA deaths or a note for suppression:\n")
    checkProblems %>%
        filter(problem, !noCheck) %>%
        arrange(desc(year), desc(week)) %>%
        select(state, weekEnding, year, week, age, Suppress, deaths) %>%
        as.data.frame() %>%
        print()
    
    # Summarize the problems
    cat("\n\nProblems by state:\n")
    checkProblems %>%
        group_by(noCheck, state, problem) %>%
        summarize(n=n(), deaths=specNA(sum)(deaths), .groups="drop") %>%
        filter(problem) %>%
        print()
    
    # Assess the amount of error
    errorState <- checkProblems %>%
        filter(problem, !noCheck) %>%
        count(state)
    
    # Error out if threshold for error by state OR total errors exceeded
    # (the 0 passed to max() avoids a warning and -Inf when no problem rows remain)
    errMax <- errorState %>% pull(n) %>% max(0)
    errTot <- errorState %>% pull(n) %>% sum()
    cat("\n\nThere are", errTot, "rows with errors; maximum for any given state is", errMax, "errors\n")
    
    if ((errTot > errTotAllowed) || (errMax > errMaxAllowed)) {
        stop("\nToo many errors; thresholds are ", errTotAllowed, " total and ", errMaxAllowed, " maximum\n")
    }
    
}



plotQCReadProcessCDC <- function(df, 
                                 ckCombos=list(c("age"), c("period", "year", "Type"), 
                                               c("period", "Suppress"), c("period", "Note")
                                               )
                                 ) {
    
    # Create dataset for analysis
    df <- df %>% 
        mutate(n=1, n_deaths_na=ifelse(is.na(deaths), 1, 0))
    
    # Check control totals by specified combinations
    purrr::walk(ckCombos, .f=function(x) {
        cat("\n\nChecking variable combination:", x, "\n")
        checkControl(df, groupBy=x, useVars=c("n", "n_deaths_na", "deaths"), fn=specNA(sum))
        }
        )
    
    # Plot deaths by state
    p1 <- checkControl(df, 
                       groupBy=c("state"), 
                       useVars=c("deaths"), 
                       fn=specNA(sum), 
                       printControls=FALSE, 
                       pivotData=FALSE
                       ) %>%
        ggplot(aes(x=fct_reorder(state, deaths), y=deaths)) + 
        geom_col(fill="lightblue") + 
        geom_text(aes(y=deaths, label=paste0(round(deaths/1000), "k")), hjust=0, size=3) + 
        coord_flip() +
        labs(y="Total deaths", x=NULL, title="Total deaths by state in all years in processed file")
    print(p1)
    
    # Plot deaths by week/year
    p2 <- checkControl(df, 
                       groupBy=c("year", "week"), 
                       useVars=c("deaths"), 
                       fn=specNA(sum), 
                       printControls=FALSE, 
                       pivotData=FALSE
                       ) %>%
        ggplot(aes(x=week, y=deaths)) + 
        geom_line(aes(group=year, color=year)) + 
        labs(title="Deaths by year and epidemiological week", x="Epi week", y="US deaths") + 
        scale_color_discrete("Year") + 
        lims(y=c(0, NA))
    print(p2)
    
}



# Function to read and process raw CDC all-cause deaths data
readProcessCDC <- function(fName, 
                           weekThru,
                           periodKeep=cdcExcessParams$periodKeep,
                           fDir="./RInputFiles/Coronavirus/",
                           col_types=cdcExcessParams$colTypes, 
                           renameVars=cdcExcessParams$remapVars,
                           maxSuppressAllowed=20, 
                           stateNoCheck=c()
                           ) {
    
    # FUNCTION ARGUMENTS:
    # fName: name of the downloaded CDC data file
    # weekThru: any record where week is less than or equal to weekThru will be kept
    # periodKeep: any record where period is in periodKeep will be kept
    # fDir: directory name for the downloaded CDC data file
    # col_types: variable type by column in the CDC data (passed to readr::read_csv())
    # renameVars: named vector for variable renaming of type c("Existing Name"="New Name")
    # maxSuppressAllowed: maximum number of data suppressions (must be in current week/year) to avoid error
    # stateNoCheck: vector of states that do NOT have suppression errors thrown
    
    # STEP 1: Read the CSV data
    cdcRaw <- fileRead(paste0(fDir, fName), col_types=col_types)
    # glimpse(cdcRaw)
    
    # STEP 2: Rename the variables for easier interpretation
    cdcRenamed <- cdcRaw %>%
        colRenamer(vecRename=renameVars) %>%
        colMutater(selfList=list("weekEnding"=lubridate::mdy))
    # glimpse(cdcRenamed)
    
    # STEP 3: Convert to factored data
    cdcFactored <- cdcRenamed %>%
        colMutater(selfList=list("age"=factor), levels=cdcExcessParams$ageLevels) %>%
        colMutater(selfList=list("period"=factor), levels=cdcExcessParams$periodLevels) %>%
        colMutater(selfList=list("year"=factor), levels=cdcExcessParams$yearLevels)
    # glimpse(cdcFactored)
    
    # STEP 4: Filter the data to include only weighted deaths and only through the desired time period
    cdcFiltered <- cdcFactored %>%
        rowFilter(lstFilter=list("Type"="Predicted (weighted)")) %>%
        filter(period %in% periodKeep | week <= weekThru)
    # glimpse(cdcFiltered)
    
    # STEP 4a: Check that all suppressed data and NA deaths have been eliminated
    cat("\n\n *** Data suppression checks *** \n")
    checkCDCSuppression(cdcFiltered, stateNoCheck=stateNoCheck, errTotAllowed=maxSuppressAllowed)
    cat("\n\nData suppression checks passed\n\n")
    
    # STEP 5: Remove any NA death fields, delete the US and PR records, convert YC to be part of NY
    cdcProcessed <- cdcFiltered %>%
        rowFilter(lstExclude=list("state"=c("US", "PR"), "deaths"=c(NA))) %>%
        mutate(state=ifelse(state=="YC", "NY", state), 
               fullState=ifelse(state %in% c("NY", "YC"), "New York State (NY plus YC)", fullState)
               ) %>%
        group_by(fullState, weekEnding, state, year, week, age, period, Type, Suppress) %>%
        arrange(!is.na(Note)) %>%
        summarize(n=n(), deaths=sum(deaths), Note=first(Note), .groups="drop") %>%
        checkUniqueRows(uniqueBy=c("state", "year", "week", "age"))
    glimpse(cdcProcessed)
    
    # STEP 5a: Check control levels for key variables in processed file
    cat("\nCheck Control Levels and Record Counts for Processed Data:\n")
    plotQCReadProcessCDC(cdcProcessed)

    # STEP 6: Return the processed data file
    cdcProcessed
    
}
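
STEP 5 of readProcessCDC() folds the New York City (YC) records into New York (NY) and then sums within each key so the file stays unique by state, year, week, and age. The same idea can be sketched on a made-up miniature data set. It is written in base R so that it runs without the sourced helpers, and all values are hypothetical:

```r
# Hypothetical miniature of the filtered CDC data (values are made up)
toy <- data.frame(state=c("NY", "YC", "NY", "YC", "NJ"), 
                  week=c(1, 1, 2, 2, 1), 
                  deaths=c(100, 50, 110, 55, 70), 
                  stringsAsFactors=FALSE
                  )

# Relabel the New York City records so they share a key with the rest of New York
toy$state <- ifelse(toy$state == "YC", "NY", toy$state)

# Sum deaths within each state-week so the result is unique by state and week
merged <- aggregate(deaths ~ state + week, data=toy, FUN=sum)
merged
```

Relabeling before aggregating is what lets a single grouped sum handle the merge, with no separate join between the NY and YC records.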

The data are processed using the updated function:

cdcLoc <- "Weekly_counts_of_deaths_by_jurisdiction_and_age_group_downloaded_20210708.csv"
cdcList_20210708_v2 <- readRunCDCAllCause(loc=cdcLoc, 
                                          weekThru=23, 
                                          lst=readFromRDS("cdc_daily_210708"), 
                                          stateNoCheck=c("NC"), 
                                          pdfCluster=TRUE, 
                                          pdfAge=TRUE
                                          )
## 
## Parameter cvDeathThru has been set as: 2021-06-12 
## 
## 
##  *** Data suppression checks *** 
## 
## Rows in states to be checked that have NA deaths or a note for suppression:
##    state weekEnding year week                age
## 1     CT 2021-06-12 2021   23        45-64 years
## 2     CT 2021-06-12 2021   23        65-74 years
## 3     CT 2021-06-12 2021   23        75-84 years
## 4     CT 2021-06-12 2021   23 85 years and older
## 5     DE 2021-06-12 2021   23        65-74 years
## 6     DE 2021-06-12 2021   23        75-84 years
## 7     DE 2021-06-12 2021   23 85 years and older
## 8     WV 2021-06-05 2021   22        45-64 years
## 9     WV 2021-06-05 2021   22        65-74 years
## 10    AK 2021-05-08 2021   18        45-64 years
## 11    AK 2021-05-08 2021   18        65-74 years
##                                                   Suppress deaths
## 1  Suppressed (counts highly incomplete, <50% of expected)     NA
## 2  Suppressed (counts highly incomplete, <50% of expected)     NA
## 3  Suppressed (counts highly incomplete, <50% of expected)     NA
## 4  Suppressed (counts highly incomplete, <50% of expected)     NA
## 5  Suppressed (counts highly incomplete, <50% of expected)     NA
## 6  Suppressed (counts highly incomplete, <50% of expected)     NA
## 7  Suppressed (counts highly incomplete, <50% of expected)     NA
## 8  Suppressed (counts highly incomplete, <50% of expected)     NA
## 9  Suppressed (counts highly incomplete, <50% of expected)     NA
## 10 Suppressed (counts highly incomplete, <50% of expected)     NA
## 11 Suppressed (counts highly incomplete, <50% of expected)     NA
## 
## 
## Problems by state:
## # A tibble: 5 x 5
##   noCheck state problem     n deaths
##   <lgl>   <chr> <lgl>   <int>  <dbl>
## 1 FALSE   AK    TRUE        2     NA
## 2 FALSE   CT    TRUE        4     NA
## 3 FALSE   DE    TRUE        3     NA
## 4 FALSE   WV    TRUE        2     NA
## 5 TRUE    NC    TRUE      114     NA
## 
## 
## There are 11 rows with errors; maximum for any given state is 4 errors
## 
## 
## Data suppression checks passed
## 
## 
## *** File has been checked for uniqueness by: state year week age 
## 
## Rows: 93,132
## Columns: 12
## $ fullState  <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "Ala~
## $ weekEnding <date> 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10, 2015-01-10~
## $ state      <chr> "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL",~
## $ year       <fct> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,~
## $ week       <int> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4,~
## $ age        <fct> Under 25 years, 25-44 years, 45-64 years, 65-74 years, 75-8~
## $ period     <fct> 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015-2019, 2015~
## $ Type       <chr> "Predicted (weighted)", "Predicted (weighted)", "Predicted ~
## $ Suppress   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ n          <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,~
## $ deaths     <dbl> 25, 67, 253, 202, 272, 320, 28, 49, 256, 222, 253, 332, 26,~
## $ Note       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## 
## Check Control Levels and Record Counts for Processed Data:
## 
## 
## Checking variable combination: age 
## # A tibble: 6 x 4
##   age                    n n_deaths_na  deaths
##   <fct>              <dbl>       <dbl>   <dbl>
## 1 Under 25 years     10919           0  375951
## 2 25-44 years        13908           0  922283
## 3 45-64 years        17084           0 3615594
## 4 65-74 years        17072           0 3626546
## 5 75-84 years        17079           0 4479686
## 6 85 years and older 17070           0 5772387
## 
## 
## Checking variable combination: period year Type 
## # A tibble: 7 x 6
##   period    year  Type                     n n_deaths_na  deaths
##   <fct>     <fct> <chr>                <dbl>       <dbl>   <dbl>
## 1 2015-2019 2015  Predicted (weighted) 14364           0 2691176
## 2 2015-2019 2016  Predicted (weighted) 14443           0 2723213
## 3 2015-2019 2017  Predicted (weighted) 14408           0 2802027
## 4 2015-2019 2018  Predicted (weighted) 14400           0 2830376
## 5 2015-2019 2019  Predicted (weighted) 14414           0 2844003
## 6 2020      2020  Predicted (weighted) 14838           0 3432903
## 7 2021      2021  Predicted (weighted)  6265           0 1468749
## 
## 
## Checking variable combination: period Suppress 
## # A tibble: 3 x 5
##   period    Suppress     n n_deaths_na   deaths
##   <fct>     <chr>    <dbl>       <dbl>    <dbl>
## 1 2015-2019 <NA>     72029           0 13890795
## 2 2020      <NA>     14838           0  3432903
## 3 2021      <NA>      6265           0  1468749
## 
## 
## Checking variable combination: period Note 
## # A tibble: 10 x 5
##    period   Note                                           n n_deaths_na  deaths
##    <fct>    <chr>                                      <dbl>       <dbl>   <dbl>
##  1 2015-20~ <NA>                                       72029           0  1.39e7
##  2 2020     Data in recent weeks are incomplete. Only~ 13459           0  3.04e6
##  3 2020     Data in recent weeks are incomplete. Only~     5           0  1.24e2
##  4 2020     Data in recent weeks are incomplete. Only~   262           0  1.57e5
##  5 2020     Weighted numbers of deaths are 20% or mor~   280           0  6.00e4
##  6 2020     Weights may be too low to account for und~    10           0  5.95e3
##  7 2020     <NA>                                         822           0  1.73e5
##  8 2021     Data in recent weeks are incomplete. Only~  5822           0  1.38e6
##  9 2021     Data in recent weeks are incomplete. Only~    34           0  3.23e3
## 10 2021     Data in recent weeks are incomplete. Only~   409           0  8.16e4

## 
## *** File has been checked for uniqueness by: cluster year week

## 
## Plots will be run after excluding stateNoCheck states

## 
## Detailed cluster summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_cluster_2021w23.pdf

## 
## Returning plot outputs to the main log file

## Joining, by = "state"

## 
## Detailed age summary PDF file is available at: ./RInputFiles/Coronavirus/Plots/CDC_age_2021w23.pdf

## 
## Returning plot outputs to the main log file
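
The run above passes the suppression check even though only NC is excluded. The threshold arithmetic can be reproduced in isolation using the per-state counts from the "Problems by state" log and the default thresholds from the function signatures:

```r
# Per-state problem counts copied from the log (NC is skipped via stateNoCheck)
errByState <- c(AK=2, CT=4, DE=3, WV=2)

errTotAllowed <- 20                          # default maxSuppressAllowed in readProcessCDC()
errMaxAllowed <- round(errTotAllowed / 2)    # default in checkCDCSuppression()

errTot <- sum(errByState)
errMax <- max(errByState)

# Same test as in checkCDCSuppression(): stop if either threshold is exceeded
tooMany <- (errTot > errTotAllowed) || (errMax > errMaxAllowed)
cat("Total:", errTot, "Max:", errMax, "Stop?", tooMany, "\n")

# R rounds halves to the even digit, so an odd total gives a lower default per-state cap
round(5 / 2)
```

With 11 problem rows in total and at most 4 in any one state, neither the total threshold of 20 nor the per-state threshold of 10 is exceeded. Because round(5 / 2) evaluates to 2 in R, passing an odd errTotAllowed yields a default per-state cap that rounds down in some cases; setting errMaxAllowed explicitly avoids any surprise.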